Synopsis: Entity-Attribute-Value

Let's have a look at the triviality of Entity-Attribute-Value in a database.

“How do we count the number of rows by date?” This is an example of a simple task for a database programmer. It involves basic SQL syntax:

Counting the number of rows by date

However, the simple solution assumes two things:

  • Values are stored in the same column, as in Bugs.date_reported.

  • Values can be compared to one another so that GROUP BY can accurately group dates with equal values together.

What if we can’t rely on those assumptions? What if the date is stored in the date_reported or report_date column, or in any other column that’s different on each of its rows. What if dates can take a variety of different formats and the computer can’t easily compare two dates?

We may encounter these problems and others when we employ the antipattern known as Entity-Attribute-Value (EVA).

Objective: Support variable attributes#

Extensibility is frequently a goal of software projects. We would like to design software that can adapt fluidly to future usage with little or no additional programming.

This is not a new problem; similar arguments against the inflexibility of relational database metadata have been made almost continuously since 1970, when the relational model was first proposed in A Relational Model of Data for Large Shared Data Banks by E. F. Codd.

A conventional table consists of attribute columns relevant for every row in the table since every row represents an instance of a similar object. A different set of attributes represents a different object type, so it belongs in a different table.

In modern object-oriented programming models, however, different object types can be related, for instance, by extending the same base type. In object-oriented design, these objects are considered instances of the same base type, as well as instances of their respective subtypes. We would like to store objects as rows in a single database table to simplify comparisons and calculations over multiple objects. But we also need to allow objects of each subtype to store their respective attribute columns, which may not apply to the base type or other subtypes.

Let’s use an example from our bugs database. In the figure “Object-oriented class diagram for bug types," we can see that a Bug and a Feature Request share some attributes in common, as seen in the Issue base type. Every issue is associated with a person who reported it. It’s also associated with a product and it has a priority for completion. However, a bug has some distinct attributes: the version of the product in which the bug occurs and the severity or impact of the bug. Likewise, a FeatureRequest may have its own attributes as well. For this example, let’s suppose that a feature is associated with a sponsor whose budget supports that feature’s development.

Object-oriented class diagram for bug types

Legitimate uses of the antipattern#

It’s hard to justify using the EAV antipattern in a relational database. We have to compromise too many important features of the relational paradigm. Still, there are instances when it is legitimately needed in some applications to support dynamic attributes.

Most applications that need schemaless data need it for only a few tables or even just one table. The rest of our data requirements conform to standard table designs. If you need to account for the extra work and risk of EAV in your project plan, it may be the lesser evil to just use it sparingly. But we need to keep in mind the reports of experienced database consultants who say that systems using EAV become unwieldy within a year.

If we have non-relational data management needs, the best answer is to use a non-relational technology. Going into details of alternate technologies is beyond the scope of this course; however, here is a non-exhaustive list of a few of these technologies:

Many other non-relational projects are also emerging. However, the weaknesses of EAV relative to relational databases also apply to these alternatives. When metadata is fluid, it’s harder to formulate simple queries. Applications spend a lot of energy discovering the structure of data and adapting to it.

Untitled Masterpiece
Antipattern: Use a Generic Attribute Table
Mark as Completed
Report an Issue